Introduction

Acute Myeloid Leukemia

Acute myeloid leukemia (AML) is characterized by an increase in the number of myeloid cells in the bone marrow and a premature arrest of the maturation of these cells. This results primarily in hematopoietic insufficiency. DNA hypermethylation at gene promoters, which leads to gene silencing, is frequently observed.

Azacitidine (5-azacytidine) is a DNA methyltransferase inhibitor (DNMTi) that has been used to treat patients with AML. However, the inhibition is not permanent, and DNA hypermethylation often resumes after treatment. Identifying which genes are regulated by DNA methylation can inform further treatment strategies and provide a mechanistic understanding of DNA demethylation in AML.

Study Design and Workflow Analysis

In this study, OCI-AML3 cells were treated with 5-azacytidine and compared with a paired untreated control. The experiment was performed in triplicate, and total RNA was then sequenced.

The dataset was obtained from recount3, and differential gene expression analysis was performed with the R package DESeq2. Gene set enrichment analysis (GSEA) was then performed with the clusterProfiler package in R.

Data Quality Check

Principal Component Analysis

Principal component analysis (PCA) is a dimensionality-reduction method, used here to check whether the treatment condition is the main source of variation in the data and whether the replicates within each condition are similar. In the figure below, the samples separate by treatment condition along PC1 and cluster tightly within their treatment groups. PC1, which corresponds to the treatment condition, accounts for 97% of the variation, while PC2 accounts for only 1%, indicating minimal variation between technical replicates, which is ideal.
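To make the "percentage of variation" figure concrete, the following is a minimal sketch of how the share of variance captured by PC1 can be computed. It is written in plain Python for illustration only and is not the code used for this report, which was produced in R. The function name `pc1_variance_fraction` and the toy expression values are hypothetical, and the leading eigenvalue is found by simple power iteration, an assumption made to keep the example dependency-free.

```python
import math


def pc1_variance_fraction(samples, iterations=200):
    """Return the fraction of total variance captured by the first principal component.

    `samples` is a list of samples, each a list of per-gene values
    (assumed to be already transformed, e.g. on a log scale).
    """
    n = len(samples)
    n_genes = len(samples[0])
    # Centre every gene (column) on its mean across samples.
    means = [sum(row[j] for row in samples) / n for j in range(n_genes)]
    centred = [[row[j] - means[j] for j in range(n_genes)] for row in samples]
    # Sample-by-sample Gram matrix. Its non-zero eigenvalues are proportional
    # to the variances along the principal components.
    gram = [[sum(a * b for a, b in zip(r1, r2)) for r2 in centred] for r1 in centred]
    total = sum(gram[i][i] for i in range(n))  # trace = sum of all eigenvalues
    if total == 0:
        raise ValueError("samples are identical; there is no variance to explain")
    # Power iteration for the leading eigenvector (arbitrary non-constant start).
    vec = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        product = [sum(g * v for g, v in zip(row, vec)) for row in gram]
        norm = math.sqrt(sum(p * p for p in product))
        if norm == 0:
            raise ValueError("start vector is orthogonal to the data")
        vec = [p / norm for p in product]
    # Rayleigh quotient of the unit vector estimates the leading eigenvalue.
    product = [sum(g * v for g, v in zip(row, vec)) for row in gram]
    eigenvalue = sum(p * v for p, v in zip(product, vec))
    return eigenvalue / total


# Hypothetical toy data: three control and three treated samples, three genes.
toy_samples = [
    [1.0, 2.0, 3.0], [1.1, 2.0, 3.1], [0.9, 2.1, 3.0],  # control
    [5.0, 6.0, 3.0], [5.1, 6.1, 3.1], [4.9, 6.0, 2.9],  # treated
]
fraction = pc1_variance_fraction(toy_samples)
```

When the two groups differ strongly and replicates agree closely, as in the toy data, almost all of the variance falls on PC1, which is the pattern described above.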

Fig.1: PCA of samples


Differentially Expressed Genes (DEGs) Analysis

Volcano Plot of DEGs

A volcano plot is typically used to quickly visualize the relationship between statistical significance (the p-value) and the log2 fold change in expression at the gene level. Using arbitrary cut-offs for fold change and p-value, we can visualize the proportion of genes that are significantly up- or downregulated.

In the figure below, relatively few genes were significantly differentially expressed following treatment with 5-azacytidine, and the large majority of these DEGs were downregulated. The cut-offs used were an adjusted p-value <= 0.05 and an absolute fold-change >= 1.5.
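As an illustration of how such cut-offs partition the genes in a volcano plot, here is a small Python sketch. It is not the code behind the figure. It assumes that the fold-change cut-off of 1.5 is on the linear scale, so it is compared against log2(1.5) on the log2 axis; if the cut-off was actually applied to log2 fold change directly, the comparison would differ. The function name and the example values are hypothetical.

```python
import math
from collections import Counter

PADJ_CUTOFF = 0.05         # adjusted p-value cut-off stated in the text
FOLD_CHANGE_CUTOFF = 1.5   # assumed to be a linear-scale fold change


def classify_gene(log2_fold_change, padj,
                  padj_cutoff=PADJ_CUTOFF, fc_cutoff=FOLD_CHANGE_CUTOFF):
    """Label one gene as 'up', 'down' or 'not significant'."""
    # Genes without a usable adjusted p-value are treated as not significant.
    if padj is None or math.isnan(padj):
        return "not significant"
    if padj > padj_cutoff:
        return "not significant"
    # Compare on the log2 scale: |log2FC| >= log2(1.5), about 0.585.
    if abs(log2_fold_change) < math.log2(fc_cutoff):
        return "not significant"
    return "up" if log2_fold_change > 0 else "down"


# Hypothetical (log2 fold change, adjusted p-value) pairs.
example_results = [(1.0, 0.01), (-2.0, 0.001), (0.3, 0.001), (2.0, 0.2), (-0.9, 0.04)]
counts = Counter(classify_gene(lfc, padj) for lfc, padj in example_results)
```

Counting the labels in this way gives the numbers of up- and downregulated genes that the coloured regions of a volcano plot represent.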

Fig.2: Volcano Plot of Differentially Expressed Genes in Treated vs Untreated Samples


Heatmap of all DEGs

A heatmap lets us visualize the differences in gene expression between samples from different treatment conditions and check for consistency within each treatment group. In the figure below, a large number of genes are downregulated upon 5-azacytidine treatment compared with the control.

Fig.3: Heatmap of differentially expressed genes


Top 20 Over- and Under-expressed DEGs

We then looked at the top 20 over- and under-expressed genes, visualized through heatmaps.
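The text does not state how the "top" genes were ranked. The sketch below shows one common choice, assumed here purely for illustration: keep the significant genes and rank them by log2 fold change. The function name, the record layout and the gene labels are hypothetical.

```python
def top_genes(results, n=20, padj_cutoff=0.05):
    """Return (over_expressed, under_expressed) gene names, n of each at most.

    `results` is a list of dicts with keys 'gene', 'log2fc' and 'padj'.
    Ranking by log2 fold change is an assumption, not the report's stated method.
    """
    significant = [
        r for r in results
        if r["padj"] is not None and r["padj"] <= padj_cutoff
    ]
    # Largest positive fold changes first.
    ranked = sorted(significant, key=lambda r: r["log2fc"], reverse=True)
    over = [r["gene"] for r in ranked if r["log2fc"] > 0][:n]
    # Walk the ranking from the other end for the most negative fold changes.
    under = [r["gene"] for r in reversed(ranked) if r["log2fc"] < 0][:n]
    return over, under


# Hypothetical results table.
example = [
    {"gene": "geneA", "log2fc": 3.0, "padj": 0.001},
    {"gene": "geneB", "log2fc": -2.5, "padj": 0.01},
    {"gene": "geneC", "log2fc": 1.2, "padj": 0.04},
    {"gene": "geneD", "log2fc": 4.0, "padj": 0.5},   # large change, not significant
    {"gene": "geneE", "log2fc": -0.8, "padj": 0.02},
]
over, under = top_genes(example, n=2)
```

Ranking by adjusted p-value or by a test statistic instead would give a different list, so the ranking criterion is worth stating alongside the figures.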

The under-expressed genes include several that broadly contribute to tumour progression, such as members of the MAGE protein family and MMP13.

Notable over-expressed genes include CRLF2, which is involved in the development of the haemopoietic system; RTN4RL1, which is involved in cell signalling in the nervous system; and CPLX1, which is involved in the final stages of exocytosis, including that of synaptic vesicles.

Fig.4: Top 20 Over Expressed Genes


Fig.5: Top 20 Under Expressed Genes


Pathway Analysis

GSEA

Because small changes in the expression of a single gene can have cascading effects, examining changes across entire pathways gives a more holistic view. This was done with GSEA.
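The core idea of GSEA can be sketched as a running sum over a ranked gene list: the sum increases when a gene belongs to the pathway and decreases otherwise, and the enrichment score is the largest deviation from zero. The Python sketch below illustrates only that statistic. It is not clusterProfiler's implementation, and it omits the normalization and significance testing that a real analysis needs. The function name and the toy gene list are hypothetical.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Running-sum enrichment score for one gene set.

    `ranked_genes` must be sorted by `scores` in descending order.
    Hits are weighted by |score| ** weight; misses step down uniformly.
    """
    in_set = [gene in gene_set for gene in ranked_genes]
    hit_weights = [
        abs(score) ** weight if hit else 0.0
        for score, hit in zip(scores, in_set)
    ]
    total_hit = sum(hit_weights)
    n_miss = len(ranked_genes) - sum(in_set)
    if total_hit == 0 or n_miss == 0:
        raise ValueError("gene set must contain weighted hits and leave some misses")
    running = 0.0
    best = 0.0
    for hit_weight, hit in zip(hit_weights, in_set):
        # Step up in proportion to the hit's weight, or down by a fixed amount.
        running += hit_weight / total_hit if hit else -1.0 / n_miss
        # Keep the deviation from zero with the largest magnitude.
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranked list: positive scores mean higher expression after treatment.
genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
stat = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
es_top = enrichment_score(genes, stat, {"g1", "g2"})     # set at the top of the list
es_bottom = enrichment_score(genes, stat, {"g5", "g6"})  # set at the bottom of the list
```

A positive score indicates that the pathway's genes concentrate among the over-expressed genes, and a negative score that they concentrate among the under-expressed genes.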

The top over-expressed pathways broadly relate to the cell cycle and cell replication. The top under-expressed pathways were notably linked to the immune response, such as the response to interferon gamma and the response to viral infection, and to phagocytic processes.

Fig.6: Top 5 Over Expressed Pathways


Fig.7: Top 5 Under Expressed Pathways
